Excursion Risk
The risk and return profiles of a broad class of dynamic trading strategies,
including pairs trading and other statistical arbitrage strategies, may be
characterized in terms of excursions of the market price of a portfolio away
from a reference level. We propose a mathematical framework for the risk
analysis of such strategies, based on a description in terms of price
excursions, first in a pathwise setting, without probabilistic assumptions,
then in a Markovian setting.
We introduce the notion of delta-excursion, defined as a path which deviates
by delta from a reference level before returning to this level. We show that
every continuous path has a unique decomposition into delta-excursions, which
is useful for the scenario analysis of dynamic trading strategies, leading to
simple expressions for the number of trades, realized profit, maximum loss and
drawdown. As delta is decreased to zero, properties of this decomposition
relate to the local time of the path.
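A discrete-time analogue of the delta-excursion decomposition can be sketched as follows (a minimal illustration for sampled paths, not the paper's pathwise construction; the function and variable names are ours):

```python
def delta_excursions(path, ref, delta):
    """Split a sampled path into completed delta-excursions: segments that
    start at the reference level, deviate from it by at least `delta`, and
    end at the next return to it (a discrete stand-in for the continuous
    notion in the abstract)."""
    excursions, start, armed = [], 0, False
    for i in range(1, len(path)):
        if abs(path[i] - ref) >= delta:
            armed = True  # the current segment has deviated by delta
        # discrete "return to ref": hitting the level or crossing it
        crossed = path[i] == ref or (path[i - 1] - ref) * (path[i] - ref) < 0
        if armed and crossed:
            excursions.append(path[start:i + 1])
            start, armed = i, False
    return excursions, path[start:]  # completed excursions + incomplete tail
```

For a strategy that opens a unit position once the price deviates by delta and closes it on the return to the reference level, the number of completed excursions counts the round-trip trades and each contributes roughly delta of realized profit, which is the kind of scenario quantity the abstract describes.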
When the underlying asset follows a Markov process, we combine these results
with Itô's excursion theory to obtain a tractable decomposition of the process
as a concatenation of independent delta-excursions, whose distribution is
described in terms of Itô's excursion measure. We provide analytical results
for linear diffusions and give new examples of stochastic processes for
flexible and tractable modeling of excursions. Finally, we describe a
non-parametric scenario simulation method for generating paths whose excursion
properties match those observed in empirical data.
Comment: 36 pages; 10 figures
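In its simplest form, a non-parametric scenario simulation of this kind could resample observed excursions with replacement and concatenate them (a toy sketch under our own naming; the paper's method additionally matches excursion properties, which we do not attempt here):

```python
import random

def resample_path(excursions, n_exc, seed=0):
    """Concatenate n_exc excursions drawn with replacement from an
    empirical collection; each excursion starts and ends at the same
    reference level, so segments glue together without shifting."""
    rng = random.Random(seed)
    path = [excursions[0][0]]  # start at the reference level
    for _ in range(n_exc):
        e = rng.choice(excursions)
        path.extend(e[1:])  # drop the duplicated starting point
    return path
```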
Risk-Aware Linear Bandits: Theory and Applications in Smart Order Routing
Motivated by practical considerations in machine learning for financial
decision-making, such as risk aversion and large action spaces, we initiate the
study of risk-aware linear bandits. Specifically, we consider regret
minimization under the mean-variance measure when facing a set of actions whose
rewards can be expressed as linear functions of (initially) unknown parameters.
Driven by the variance-minimizing G-optimal design, we propose the Risk-Aware
Explore-then-Commit (RISE) algorithm and the Risk-Aware Successive Elimination
(RISE++) algorithm. Then, we rigorously analyze their regret upper bounds to
show that, by leveraging the linear structure, the algorithms can dramatically
reduce the regret when compared to existing methods. Finally, we demonstrate
the performance of the algorithms by conducting extensive numerical experiments
in a synthetic smart order routing setup. Our results show that both RISE and
RISE++ can outperform the competing methods, especially in complex
decision-making scenarios.
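A toy explore-then-commit loop for a mean-variance linear bandit might look like the following (our own minimal sketch: uniform exploration in place of the G-optimal design used by RISE, and all names are ours):

```python
import numpy as np

def etc_mean_variance(actions, pulls, rho, theta, noise_sd, seed=0):
    """Explore each action `pulls` times, estimate the unknown parameter by
    least squares, then commit to the action maximizing estimated mean
    reward minus rho times its empirical reward variance."""
    rng = np.random.default_rng(seed)
    K, _ = actions.shape
    X, y = [], []
    per_action = [[] for _ in range(K)]
    for k in range(K):
        for _ in range(pulls):  # rewards are linear in the unknown theta
            r = actions[k] @ theta + rng.normal(0.0, noise_sd[k])
            X.append(actions[k]); y.append(r); per_action[k].append(r)
    theta_hat, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    mean_hat = actions @ theta_hat
    var_hat = np.array([np.var(v) for v in per_action])
    return int(np.argmax(mean_hat - rho * var_hat)), theta_hat
```

The linear structure is what makes this sample-efficient: a single shared estimate of theta prices every arm at once, instead of learning each arm separately.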
Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon
We explore reinforcement learning methods for finding the optimal policy in
the linear quadratic regulator (LQR) problem. In particular, we consider the
convergence of policy gradient methods in the setting of known and unknown
parameters. We are able to produce a global linear convergence guarantee for
this approach in the setting of finite time horizon and stochastic state
dynamics under weak assumptions. The convergence of a projected policy gradient
method is also established in order to handle problems with constraints. We
illustrate the performance of the algorithm with two examples. The first
example is the optimal liquidation of a holding in an asset. We show results
both for the case where we assume a model for the underlying dynamics and for
the case where we apply the method to the data directly. The empirical
evidence suggests that the
policy gradient method can learn the global optimal solution for a larger class
of stochastic systems containing the LQR framework and that it is more robust
with respect to model mis-specification when compared to a model-based
approach. The second example is an LQR system in a higher dimensional setting
with synthetic data.
Comment: 49 pages, 9 figures
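For intuition, here is a minimal sketch on a scalar noisy LQR with a finite horizon: gradient descent on the exact expected cost of a time-varying linear feedback (we use finite differences of the exact cost in place of a sampled policy gradient, and all parameter choices are illustrative), checked against the Riccati solution:

```python
import numpy as np

def lqr_cost(K, a, b, q, r, sigma, m0):
    """Expected cost of u_t = -K[t] x_t on x_{t+1} = a x_t + b u_t + w_t,
    w_t ~ N(0, sigma^2), with E[x_0^2] = m0 and terminal cost q E[x_T^2]."""
    m, c = m0, 0.0  # m tracks the second moment E[x_t^2]
    for k in K:
        c += (q + r * k * k) * m
        m = (a - b * k) ** 2 * m + sigma ** 2
    return c + q * m

def policy_gradient(a, b, q, r, sigma, m0, T, iters=5000, lr=0.02, eps=1e-5):
    K = np.zeros(T)  # start from the zero policy
    for _ in range(iters):
        g = np.zeros(T)
        for t in range(T):  # finite-difference gradient of the exact cost
            Kp, Km = K.copy(), K.copy()
            Kp[t] += eps; Km[t] -= eps
            g[t] = (lqr_cost(Kp, a, b, q, r, sigma, m0)
                    - lqr_cost(Km, a, b, q, r, sigma, m0)) / (2 * eps)
        K -= lr * g
    return K

def riccati_gains(a, b, q, r, T):
    """Backward Riccati recursion for the scalar finite-horizon LQR."""
    P, gains = q, []
    for _ in range(T):
        k = a * b * P / (r + b * b * P)
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
        gains.append(k)
    return np.array(gains[::-1])
```

With state noise present (sigma > 0), gradient descent from the zero policy recovers the Riccati-optimal gains in this scalar example, illustrating the kind of global convergence the abstract establishes.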
Policy gradient methods find the Nash equilibrium in N-player general-sum linear-quadratic games
We consider a general-sum N-player linear-quadratic game with stochastic dynamics over a finite horizon and prove the global convergence of the natural policy gradient method to the Nash equilibrium. To prove convergence of the method we require a certain amount of noise in the system, and we give a condition, essentially a lower bound on the covariance of the noise in terms of the model parameters, that guarantees convergence. We illustrate our results with numerical experiments, showing that even in situations where the policy gradient method may not converge in the deterministic setting, the addition of noise leads to convergence.
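As a sketch of the update being analyzed (notation ours, following the standard natural-policy-gradient form for linear-quadratic problems), each player i updates its feedback gain by

```latex
K_i \;\leftarrow\; K_i \;-\; \eta \, \nabla_{K_i} J_i(K_1,\dots,K_N)\, \Sigma_K^{-1},
\qquad
\Sigma_K \;=\; \sum_{t=0}^{T} \mathbb{E}\!\left[x_t x_t^\top\right],
```

where Sigma_K is the state covariance under the joint policy. A noise covariance bounded below keeps Sigma_K well conditioned, which is the role played by the condition described in the abstract.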